The ability of membrane proteins to sense and respond
to changes in membrane voltage is critical for a vast 
array of biological processes, from generating and prop- 
agating the nerve impulse to excitation-secretion cou- 
pling. Hodgkin and Huxley (1952) were the first to 
characterize voltage-activated potassium (Kv) and so- 
dium (Nav) currents, but the common architecture of 
voltage-activated cation channels only became apparent 
after the genes encoding these proteins were identified 
(Noda et al., 1986; Tanabe et al., 1987; Timpe et al., 
1988). We now appreciate that the voltage sensitivity of 
these channels can be ascribed to modular
voltage-sensing domains comprised of S1-S4 helices (Lu et al.,
2001, 2002; Jiang et al., 2003; Long et al., 2007; Bezanilla,
2008; Swartz, 2008). Bioinformatic searches subsequently
identified S1-S4 voltage-sensing domains in other pro- 
tein families, including voltage-sensitive phosphatases 
(VSPs; Kumanovics et al., 2002; Murata et al., 2005), 
where the S1-S4 domain regulates an intracellular enzyme,
and voltage-activated proton channels (Hv1;
Ramsey et al., 2006; Sasaki et al., 2006), in which the 
S1-S4 domain forms a stand-alone pore. Although the 
sequences of S1-S4 domains vary considerably, the mech- 
anisms of these domains appear to be so highly conserved 
that Kv channels from humans and hyperthermophilic 
archaebacteria are both sensitive to voltage sensor toxins
from tarantula venom (Swartz, 2008). S1-S4 domains 
also adopt similar structures in the few Kv and Nav 
channels that have been successfully crystalized (Jiang 
et al., 2003; Long et al., 2007; Payandeh et al., 2011), 
suggesting that there is a common blueprint, or design 
principle, for constructing a voltage sensor. In this issue 
of The Journal of General Physiology, Palovcak et al. 
computationally analyze the thousands of examples of 
S1-S4 domains present in all three kingdoms of life 
to identify the key design features common to all 
S1-S4 domains. 

Palovcak et al. (2014) begin by identifying S1-S4 
domain sequences in the National Center for Biotech- 
nology Information database using a hidden Markov 
model (HMM; Eddy, 2004) trained on an initial set, or 
seed, of well known, phylogenetically diverse sequences. 
This training seed allows the HMM to estimate the 
probability distribution of amino acids at each position,
and because the HMM accounts for differences in evolutionary
pressure, it is typically better able to detect
distantly related sequences than a simple BLAST (Basic
Local Alignment Search Tool) search. Additionally, an 
HMM can automatically provide the most probable 
alignment, and its likelihood, for each residue in the 
sequence, which is especially helpful when the alignment
appears ambiguous. Using this approach, the
authors' HMM identified >6,600 sequences from all
known branches of the family of S1-S4 domains. After
grouping similar sequences, the authors were left with 3,821
effectively independent sequences, a colossal number
compared with those obtained through previous multi- 
ple sequence alignment (MSA) analyses of voltage-acti- 
vated ion channels. 
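
As a rough illustration of how a seed alignment yields position-specific amino acid probabilities that can then score candidate sequences, the sketch below builds an ungapped profile and computes a log-odds score. It is a deliberate simplification and not the authors' pipeline: a real profile HMM also models insertions and deletions, and the seed sequences, pseudocount, and uniform background used here are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_probabilities(seed_alignment, pseudocount=1.0):
    """Estimate amino acid probabilities at each position of an ungapped seed."""
    length = len(seed_alignment[0])
    profile = []
    for i in range(length):
        counts = Counter(seq[i] for seq in seed_alignment if seq[i] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        # Pseudocounts keep unseen residues from getting zero probability.
        profile.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return profile

def log_odds_score(profile, sequence, background=None):
    """Sum log2(position probability / background probability) over a sequence."""
    if background is None:
        # Assumption: uniform background; real tools use empirical frequencies.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    return sum(math.log2(profile[i][aa] / background[aa])
               for i, aa in enumerate(sequence))

# Hypothetical toy seed resembling an S4-like ArgXX repeat (not real sequences).
seed = ["RVIRLARIF", "RILRLVRVF", "RVLRIARLF"]
profile = column_probabilities(seed)
related = log_odds_score(profile, "RIIRLVRIF")
unrelated = log_odds_score(profile, "GGSDEGGSD")
print(related > unrelated)
```

Because each position is scored against its own distribution, a sequence that matches the seed only at its conserved positions can still score well, which is the intuition behind the improved sensitivity to distant relatives.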

One benefit of simultaneously comparing a large 
number of evolutionarily diverse sequences is that the 
ambiguity of individual alignments can be directly 
quantified. For S1-S4 domains, the alignment of S2 and 
S3 is relatively clear because of the highly conserved 
acidic residues; however, published binary sequence 
alignments have differed considerably for S1 and S4
(Lacroix and Bezanilla, 2012; Mishina et al., 2012; 
Kulleperuma et al., 2013). Aligning S4, the helix that 
moves in response to changes in membrane voltage, has 
been particularly difficult because this helix contains a 
repeating triad of Arg and two hydrophobic residues 
(e.g., ArgXXArgXXArgXXArgXX) that can vary in 
length between three and six Arg residues (and can 
even contain gaps). Thus, whenever two S4 helices have
different numbers of Arg residues, there will be several 
different registers (each shifted by three residues) that 
equally optimize the number of aligned Arg residues 
(Kulleperuma et al., 2013). However, the authors dem- 
onstrate that because an HMM weights each position by 
its variability, one can define a most-probable alignment 
for S4 and compare it with other alignments in terms 
of the likelihood of finding each specific residue at 
a particular position (called a posterior probability; 
Wolfsheimer et al., 2012). In Fig. 1 A, we show the output
MSA from Palovcak et al. (2014) with representatives
from several branches of the family, including the
Shaker (Timpe et al., 1988) and Kv2.1 channels, for
which extensive functional data exist (Bezanilla, 2008;
Swartz, 2008); the Kv1.2/2.1 paddle chimera, the only
eukaryotic voltage-activated channel for which an x-ray
structure has been solved (Long et al., 2007); and
NavAb, a Nav channel from Arcobacter butzleri for which
an x-ray structure has been solved (Payandeh et al.,
2011) and that the authors use as a reference. This output
MSA clearly provides a valuable starting point for
comparisons between any two S1-S4 domains, in particular
those where x-ray structures are not available for
a closely related S1-S4 domain.

Figure 1. Sequence alignment of S1-S4 voltage-sensing domains. (A) Sequence alignment from the output MSA of Palovcak et al.
(2014), including Shaker Kv (UniProt accession no. P08510), Kv1.2/2.1 paddle chimera (Protein Data Bank accession no. 2R9R_B),
rat Kv2.1 (UniProt accession no. P15387), rat Kv1.2 (UniProt accession no. P63142), human Kv11.1 (UniProt accession no. Q12809),
KvAP (UniProt accession no. Q9YDF8), human Hv1 (UniProt accession no. Q96D96), Ciona VSP (UniProt accession no. Q4W8A1),
and NavAb (UniProt accession no. A8EVM5). Conserved positions are colored as follows: polar, green; hydrophobic, black; acidic, red;
and basic, blue. Positions highlighted in yellow have not been extensively studied previously but are identified as highly conserved by
Palovcak et al. (2014). Helices are positioned according to the structure of the Kv1.2/2.1 paddle chimera (2R9R; Long et al., 2007).
(B) Alternative alignment of S1 that takes into consideration the longer S1 helix in Kv channels (Long et al., 2007) compared with Nav
channels (Payandeh et al., 2011). In this alignment, positions 11 and 14 in NavAb are equally or more highly conserved than in A, and
position 15 becomes highly conserved.
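
The register ambiguity that makes S4 hard to align can be shown with a toy calculation. The sketch below slides a shorter ArgXX repeat along a longer one and counts aligned Arg residues at each offset. Both sequences are hypothetical and were constructed only to have four and six Arg triads, and counting matched Arg residues is an illustrative stand-in for a real alignment score.

```python
def aligned_arg_count(longer, shorter, offset):
    """Count positions where both sequences carry Arg at a given offset."""
    return sum(
        1
        for i, aa in enumerate(shorter)
        if aa == "R" and longer[i + offset] == "R"
    )

def register_scores(longer, shorter):
    """Score every ungapped placement of `shorter` within `longer`."""
    max_offset = len(longer) - len(shorter)
    return {offset: aligned_arg_count(longer, shorter, offset)
            for offset in range(max_offset + 1)}

# Hypothetical S4-like helices: six and four ArgXX triads, respectively.
six_arg = "RVIRLVRVFRIFRLARLL"
four_arg = "RILRLARIFRLL"

scores = register_scores(six_arg, four_arg)
best = max(scores.values())
best_offsets = [offset for offset, s in scores.items() if s == best]
print(scores)
print(best_offsets)
```

In this toy case, the placements shifted by three residues tie for the maximum number of aligned Arg residues, so a criterion based on Arg matching alone cannot choose among them. Weighting each position by its variability across many sequences is what breaks the tie.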

Intuitively, the most striking feature of any MSA is the 
degree of conservation at particular positions. Indeed, 
previous studies have identified a host of critical resi- 
dues within S1-S4 domains, including the highly con- 
served periodic motif of basic residues within the S4 
helix (Fig. 1 A, blue residues), acidic residues in S1-S3 
that serve as stabilizing countercharges for the S4 Arg 
residues (Fig. 1 A, red residues), and highly conserved 
bulky hydrophobic residues near the middle of the S1-S4
domain that S4 Arg residues move past as the domain
changes conformation between resting and activated 
states (Fig. 1 A, black residues; Noda et al., 1986; Tanabe 
et al., 1987; Timpe et al., 1988; Papazian et al., 1995; 
Aggarwal and MacKinnon, 1996; Seoh et al., 1996; Jiang
et al., 2003; Long et al., 2007; Bezanilla, 2008; Swartz, 
2008; Tao et al., 2010; Lacroix and Bezanilla, 2011). 
In Palovcak et al. (2014), the authors take an unbiased 
approach by calculating the Kullback-Leibler divergence
(D_KL) for each position in their large MSA. Essentially,
D_KL compares the distribution of amino acids at
each position with those typically found in that environment
(e.g., a lipid-facing inner or outer membrane
interface), thereby identifying sites that are evolutionarily
constrained. For example, position 25 in the S1 of
NavAb has a high D_KL because it is commonly occupied
by polar residues (Asn and Ser), which are uncommon
in the middle of the membrane. In Hv channels, this
position (D112) plays a crucial role in proton conduc- 
tion (Musset et al., 2011). In the end, this analysis iden- 
tifies 21 positions within S1-S4 domains that have high 
scores and thus have been subjected to particularly 
strong evolutionary pressure. Reassuringly, 13 out of 21 
of these positions have been previously shown to play 
critical roles in the hydrophobic core of the domain, 
the acidic residue clusters that stabilize S4, or the periodic
ArgXX motif within S4 (Fig. 1 A). Most of the eight
new positions identified with this approach are located 
in the intracellular half of S1-S3 (e.g., residues 11, 14, 
63, 71, 74, 76, and 77 in NavAb) and are typically occu- 
pied by polar, aromatic, and positively charged residues. 
Although the precise mechanistic significance of these 
positions remains unclear, the authors' analysis strongly 
suggests that further investigation is warranted. 
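
To make the conservation score concrete, the sketch below computes a Kullback-Leibler divergence for a single alignment column against a background distribution. It is a minimal illustration under stated assumptions: the background is a hypothetical membrane-core distribution that simply favors hydrophobic residues tenfold, the pseudocount is arbitrary, and the two columns are invented. None of these values come from Palovcak et al. (2014).

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVLIFMW")

# Assumption: a made-up membrane-core background favoring hydrophobic residues.
weights = {aa: (10.0 if aa in HYDROPHOBIC else 1.0) for aa in AMINO_ACIDS}
total_weight = sum(weights.values())
background = {aa: w / total_weight for aa, w in weights.items()}

def kl_divergence(column, background, pseudocount=0.5):
    """Return D_KL(P || Q) in bits for observed column P and background Q."""
    residues = [aa for aa in column if aa in background]
    counts = Counter(residues)
    total = len(residues) + pseudocount * len(background)
    dkl = 0.0
    for aa, q in background.items():
        p = (counts[aa] + pseudocount) / total
        dkl += p * math.log2(p / q)
    return dkl

polar_column = "NNNSNNSNNN"        # polar residues where hydrophobics are expected
hydrophobic_column = "LIVFALIVMW"  # varied, but typical for the environment

dkl_polar = kl_divergence(polar_column, background)
dkl_hydrophobic = kl_divergence(hydrophobic_column, background)
print(dkl_polar > dkl_hydrophobic)
```

The polar column scores much higher than the hydrophobic one even though both are variable, which mirrors the point made above for position 25: what is scored is the departure from what the environment would predict, not conservation in isolation.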

Palovcak et al. (2014) then use their gargantuan MSA 
to search for key interactions within S1-S4 domains by 
identifying positions undergoing coevolution. Concep- 
tually, the idea behind this analysis is straightforward. 
If two positions, A and B, make an essential interaction 
in any conformation, variations in the amino acid at position
A (e.g., acidic to polar) should be correlated with
variations at position B (e.g., basic to polar). It is important
to note that there is not a simple link between pairs
of residues that exhibit direct structural interactions 
and those that coevolve. First, two positions can have 
a very strong structural interaction but show no signs 
of coevolution if one or both positions are invariant. 
For example, there is strong experimental evidence that 
Arg residues in the S4 helix of Kv channels interact 
strongly with the charge-transfer center (F56 in NavAb 
and F290 in Shaker; Tao et al., 2010; Lacroix and 
Bezanilla, 2011), but the Arg residues and F290 vary so 
infrequently that they show no sign of coevolution. 
Conversely, two positions that play critical roles in stabi- 
lizing the same state could undergo coevolution even if 
there is no direct structural interaction between them. 
However, coevolution is strongly suggestive of a struc- 
tural interaction between two positions, and to detect 
such sites, the authors performed a direct-coupling 
analysis (DCA), which quantifies the degree to which 
variations of the amino acid at one position are corre- 
lated with a second. 
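
The logic of detecting coevolution, and the reason invariant positions escape it, can be sketched with mutual information between two alignment columns. This is a simpler measure swapped in for illustration and is not DCA itself: DCA fits a global statistical model across all positions so that direct couplings can be separated from indirect correlations. The columns below are hypothetical.

```python
import math
from collections import Counter

def mutual_information(col_a, col_b):
    """Mutual information (bits) between two equal-length alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

# Hypothetical covarying pair: acidic-to-polar at A tracks basic-to-polar at B.
covarying_a = "DDDNNNDN"
covarying_b = "RRRSSSRS"

# Hypothetical invariant pair: a strong contact that never varies.
invariant_a = "FFFFFFFF"
invariant_b = "RRRRRRRR"

mi_covarying = mutual_information(covarying_a, covarying_b)
mi_invariant = mutual_information(invariant_a, invariant_b)
print(mi_covarying, mi_invariant)
```

The invariant pair carries no signal at all, however strong the underlying interaction, because there is no variation to correlate. This is the same limitation described above for the S4 Arg residues and the charge-transfer center.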

Using this approach, the authors uncovered 24 pairs 
of residues within the S1-S4 domains that are strongly 
coupled and mapped these pairs onto the activated- 
state structure of NavAb (Payandeh et al., 2011). For readers
more familiar with the structure of the Kv1.2/2.1
paddle chimera (Long et al., 2007), we provide the 
equivalent map onto that structure (Fig. 2). Of the top 
24 pairs of residues identified, 20 are positioned to in- 
teract directly in the NavAb structure, and 19 are so po- 
sitioned in the paddle chimera structure. Most of these 
coupled pairs of residues are positioned between S1
and S2 (Fig. 2, dashed blue lines) or between S2 and S3 
(Fig. 2, dashed magenta lines), and only two pairs are 
between S4 and the other three helices. In effect, this 
DCA supports the idea that S1-S3 forms a relatively sta- 
tionary scaffold against which the S4 helix moves as it 
changes conformation between resting and activated 
states. Inspection of the contact maps for the structures 
of NavAb and the Kv1.2/2.1 paddle chimera reveals a
large number of interactions between the S1-S3 heli- 
ces, supporting this idea. Those structures also show nu- 
merous contacts between S4 and the S1-S3 helices, 
many of which are not seen in the DCA. One explana- 
tion for this apparent discrepancy might be that some 
of the interactions between S4 and the other three heli- 
ces are invariant between S1-S4 domains in the MSA 
and thus cannot be seen in the DCA. Stabilizing inter- 
actions between Arg residues in S4 and acidic residues in 
S1-S3 are likely candidates for the types of interactions 
that DCA might miss. It is also possible that some inter- 
actions between S4 and S1-S3 differ between subfami- 
lies of S1-S4 domains and that the DCA cannot identify 
these because it was performed on a large and diverse 
MSA of all known S1-S4 domains. If subfamily-specific 
interactions between S4 and S1-S3 do exist, they are
unlikely to be essential because the paddle motif, a
helix-turn-helix motif composed of S3b and S4 helices, can be
transplanted between many different types of proteins
that contain S1-S4 domains, including Kv channels, Nav
channels, Hv1 channels, and VSPs, without disrupting
voltage-sensing functions (Alabi et al., 2007; Bosmans
et al., 2008). Moreover, it has also recently been reported
that coexpression of constructs encoding the N terminus
through S3 of the Shaker Kv channel with those encoding
S4 through the C terminus gives rise to functional
voltage sensors (Priest et al., 2013).

Figure 2. Mapping of evolutionarily coupled residues onto the x-ray structure of the Kv1.2/Kv2.1 paddle chimera. The Cα atoms of all
coupled residues are identified with small spheres, and S4 Arg residues are shown as sticks. 19 coupled pairs that are positioned to
interact directly are connected with dashed blue lines between S1 and S2 and with dashed magenta lines between S2 and S3. Five coupled
pairs whose positions in the structure are not compatible with direct structural interactions are connected with dashed green lines. The
24 coupled pairs are those listed in Table S3 in Palovcak et al. (2014).

At present, the available x-ray structures of Kv and 
Nav channels have provided a detailed picture of the 
activated state of the S1-S4 domains in these channels, 
but we currently lack structures of these proteins in the 
resting states that are populated at negative membrane 
voltages where the channels are closed. Although most 
of the pairs of coevolving residues identified in the DCA 
coupling results could plausibly interact directly in the 
activated state structures, several are too far apart, rais- 
ing the possibility that they interact in the resting state. 
Two of these coevolving pairs involve the first Arg position
of the S4 helix (E96 in NavAb and R362 in Shaker);
in the first pair, this position couples with N25 in NavAb
(S240 in Shaker) within the S1 helix, and in the second
it couples with N49 in NavAb (E283 in Shaker) within
the S2 helix. Cd2+ bridging experiments in the Shaker
Kv channel have shown that R362C can bridge with either
I241C in S1 or I287C in S2, and in both cases the
bridges form in the resting state (Campos et al., 2007).
These bridging residues would be near those identified
in the coupling analysis; in the case of the S4 bridge
with S1, the coupling analysis and Cd2+ bridge differ by
one residue within S1, and in the case of the S4 bridge
with S2, the two approaches differ by one turn of the S2
helix. In addition, Palovcak et al. (2014) point out
that these two coevolving pairs are compatible with sev- 
eral computational models (Vargas et al., 2012) for the 
resting states of Kv and Nav channels. 

The predictive power of the authors' analysis of se- 
quence conservation and coevolution will be clearer 
after the functional impact of their newly identified 
conserved residues and coevolving pairs has been inves- 
tigated experimentally and x-ray structures of S1-S4 
voltage-sensing domains in resting states have been 
solved. However, it is reassuring that established struc- 
tural features of S1-S4 domains, such as the S4 Arg resi- 
dues, acidic countercharges, and hydrophobic core, 
appear naturally. The lack of coevolving pairs between 
S4 and S1-S3 is also largely consistent with our current
understanding. The present DCA does not detect sev- 
eral previously defined structural interactions between 
elements within S1-S4 domains. For instance, this analy- 
sis did not detect interactions between S4 Arg residues 
and residues in either the charge-transfer center or 
the acidic residue clusters, likely because the partici- 
pating residues are too highly conserved. The present 
DCA also failed to detect interactions between the outer 
portions of S4 and S3b, where the S3b helix has been
shown to interact with S4 in the activated state and serve
as a hydrophobic stabilizer of the S4 helix (Xu et al.,
2013). In this instance, the interactions may be too nonspecific
(i.e., hydrophobic interactions) to be detected
using DCA. Indeed, these examples nicely illustrate the 
types of structural features MSA analyses can detect and 
those that must be found by other methods. 

In Palovcak et al. (2014), the authors have lumped 
together all known S1-S4 domains, which is reasonable 
when the goal is to find universal common features. 
Although S1-S4 domains share common structural 
features and mechanisms, it is likely there will be im- 
portant differences between subfamilies. For example, 
comparison of the structures of the Kv1.2/2.1 paddle
chimera and NavAb reveals that the S1 helix is one
helical turn longer in Kv channels compared with Nav
channels. As a result, the evolutionarily conserved pair
of residues 9 and 14 in NavAb is positioned
to interact locally in that x-ray structure, but these
residues lie on opposite sides of the S1 helix in the structure of
the paddle chimera (Fig. 2, left). If we introduce this
difference into the authors' MSA for S1 (Fig. 1 B),
this pair of residues can interact locally (not depicted),
the conservation of two positions identified
by Palovcak et al. (2014) improves, and a neighboring
position also becomes highly conserved (Fig. 1 B, gray
shading). It would be valuable to undertake comparable
analyses specifically comparing different subfamilies
of S1-S4 domains to correlate sequence differences
with functional specialization. For example, a conserved
Asp in S1 of Hv1 is required for proton selectivity
(Musset et al., 2011), yet there must be additional critical
adaptations because that position is conserved in
VSPs that do not conduct protons. It would also be 
interesting to compare S1-S4 domains from voltage- 
activated channels with those found in CNG and tran- 
sient receptor potential (TRP) channels, two types of 
tetrameric cation channels that lack strong voltage 
sensitivity. At least in TRPV1 channels, the S1-S4 domain 
adopts a similar fold to that discussed here, and it 
does not appear to change conformation as the chan- 
nel opens and closes in response to activating ligands 
(Cao et al., 2013). One might predict that there would 
be a larger number of coevolving residues within the 
S1-S4 domains of TRP channels and that more of 
these would occur between the S4 helix and S1-S3. 
Finally, could such an analysis shed light on how con- 
formational changes in S1-S4 domains couple to and 
control the conformation of the pore domain in volt- 
age-activated cation channels? Interactions between 
the S4-S5 linker and the C-terminal end of S6 are 
known to be crucial for coupling voltage-sensing and 
pore domains (Lu et al., 2001, 2002), and the essen- 
tial differences between channels that are activated by 
membrane depolarization compared with those activated
by hyperpolarization may reside in this region (Kwan
et al., 2012). However, the sequences in these regions 
vary considerably, and the underlying mechanisms of 
coupling voltage-sensing and pore domains remain to 
be uncovered. 